Case Study 1: Audience Size

Executive Summary

The intent of this executive summary is to answer four questions.

  1. Who listens to Sirius Radio?
  2. Who listens to Sirius Business Radio by Wharton?
  3. Does this sample appear to be a random sample from the general population?
  4. Does this sample appear to be a random sample from the MTURK population?

Based on the answers to these questions, we then address two follow-up questions.

  1. What method should be utilized to estimate the audience size?
  2. What data should be collected and where should it come from?

The variables selected for analysis were age, education, gender, income, sirius, wharton, and worktime. Survey questions without a response were recoded to NA. Responses entered in the wrong format were converted to the appropriate format. For example, age should have been a numeric input; however, some respondents wrote out their age in words, such as ‘eighteen’ rather than 18. Other responses were implausible, such as an age of 223, which is far beyond the human life span. Because it cannot be assumed that the respondent intended to type 22, 23, or 32, this response was changed to NA.
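The cleaning rules described above can be sketched in Python. This is only an illustration, not the code used for the report: the word lookup `WORD_AGES`, the bounds `MIN_AGE` and `MAX_AGE`, and the function name `clean_age` are all assumptions introduced here.

```python
from typing import Optional

# Hypothetical lookup for ages written out as words; extend as needed.
WORD_AGES = {"eighteen": 18, "nineteen": 19, "twenty": 20}

# Assumed plausibility bounds; the report only states that 223 is implausible.
MIN_AGE, MAX_AGE = 0, 120


def clean_age(raw: Optional[str]) -> Optional[int]:
    """Return a numeric age, or None (NA) when the response is unusable."""
    if raw is None:
        return None
    text = raw.strip().lower()
    if not text:
        return None  # no response -> NA
    if text in WORD_AGES:
        age = WORD_AGES[text]
    else:
        try:
            age = int(text)
        except ValueError:
            return None  # unparseable input -> NA
    # Out-of-range values such as 223 are not guessed at; they become NA.
    return age if MIN_AGE <= age <= MAX_AGE else None


responses = ["27", "eighteen", "223", "", None]
cleaned = [clean_age(r) for r in responses]
print(cleaned)  # [27, 18, None, None, None]
```

Returning None instead of guessing a nearby value mirrors the decision above not to assume whether 223 was meant to be 22, 23, or 32.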

Summary Tables

  1. Who listens to Sirius Radio?
  2. Who listens to Sirius Business Radio by Wharton?
  3. Does this sample appear to be a random sample from the general population?
  4. Does this sample appear to be a random sample from the MTURK population?

The four questions above can be answered by analyzing the following tables and figures.

Educational attainment

The summary table is split by educational attainment and shows that most listeners have either some college education (no diploma, or an Associates degree) or a bachelor’s degree. For individuals with some college education, the mean age is 28.52 and the median is 26. For individuals who hold a bachelor’s degree, the mean age is 31.14 and the median is 28.

Education                                       Frequency  Mean_Age  Median_Age  Mean_Worktime  Median_Worktime
Less than 12 years; no high school diploma      10         30.00000  27.0        23.90000       22.0
High school graduate (or equivalent)            191        30.21466  27.0        23.32984       21.0
Some college, no diploma; or Associates degree  737        28.51967  26.0        22.49389       21.0
Bachelors degree or other 4-year degree         612        31.14379  28.0        21.96078       20.0
Graduate or professional degree                 177        34.57627  31.0        23.22034       21.0
Other                                           2          50.50000  50.5        46.50000       46.5
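A summary table of this shape can be produced by grouping responses and computing the frequency, mean, and median per group. The sketch below uses made-up toy records and hypothetical field names (`education`, `age`, `worktime`), not the survey data. It also assumes NA values are dropped before averaging, which the report does not state.

```python
from collections import defaultdict
from statistics import mean, median

# Toy records (made up for illustration, not the survey data).
records = [
    {"education": "High school graduate", "age": 27, "worktime": 21},
    {"education": "High school graduate", "age": 33, "worktime": 25},
    {"education": "Bachelors degree", "age": 28, "worktime": 20},
    {"education": "Bachelors degree", "age": None, "worktime": 22},  # NA age
    {"education": "Bachelors degree", "age": 34, "worktime": 18},
]


def summarize(rows, group_key):
    """Group rows by group_key and report frequency, mean, and median."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    summary = {}
    for name, members in groups.items():
        # Assumption: NA (None) values are excluded from the statistics.
        ages = [m["age"] for m in members if m["age"] is not None]
        work = [m["worktime"] for m in members if m["worktime"] is not None]
        summary[name] = {
            "Frequency": len(members),
            "Mean_Age": mean(ages),
            "Median_Age": median(ages),
            "Mean_Worktime": mean(work),
            "Median_Worktime": median(work),
        }
    return summary


table = summarize(records, "education")
for name, stats in table.items():
    print(name, stats)
```

The same function can be called with a different `group_key` (for example a gender or income field) to build the other summary tables in this section.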

Age

For a closer look at the age of listeners, the frequency table below shows the gender breakdown along with the mean and median age of male and female listeners.

Gender Frequency Mean_Age Median_Age Mean_Worktime Median_Worktime
Female 729 31.9561 29 23.52812 21
Male 1000 29.0750 27 21.76400 20

Histograms of age, split by gender and income, also show the age distribution.

Income

Income              Frequency  Mean_Age  Median_Age  Mean_Worktime  Median_Worktime
Less than $15,000   206        26.57767  24          21.19903       20.0
$15,000 - $30,000   360        29.45833  27          23.09167       21.0
$30,000 - $50,000   421        30.73634  28          22.85036       20.0
$50,000 - $75,000   372        31.46774  29          22.15860       20.0
$75,000 - $150,000  326        31.59202  30          22.80368       21.0
Above $150,000      44         30.59091  24          21.34091       20.5

Wharton and Sirius?

Box Plots

The box plots below are faceted by educational attainment and income and describe the users who listen to both Wharton radio and Sirius radio. Each box plot is split by gender and shows the age distribution of listeners for each gender.

Conclusion: Who listens to Wharton Radio?

Based on the tables and charts above, the average listener tends to be in their mid-20s to early 30s. Most have at least some college education, and listeners are more often male than female. The most common income bracket is $30,000 - $50,000 a year. Knowing the general audience base, the next two questions can be addressed.

  1. What method should be utilized to estimate the audience size?
  2. What data should be collected and where should it come from?

Method for estimating Audience Size?

What data should be collected and where should it come from? Rather than collecting data through MTURK, a more direct view of the listener base could be obtained by pulling data from Stitcher, which has been acquired by Sirius XM. This data may provide more specific …

Case Study 2: Women in Science

Questions

  1. How many fields are there?
  2. What types of degrees are there?
  3. Which years of statistics are reported?

In this data set there are 10 different fields of study, 3 different degree levels, and 11 years’ worth of data from 2006 to 2016. This can be seen in the summary tables below.

By Field and Sex

This table shows the 10 different fields of study and the total number of male and female people who studied each field. Across the 10 fields, there is an even split: five fields are female-dominated and five are male-dominated.

Field                                   Male     Female    More
Agricultural sciences                   152956   172852    Female
Biological sciences                     516556   745384    Female
Computer sciences                       626248   171808    Male
Earth, atmospheric, and ocean sciences  53118    36076     Male
Engineering                             1147112  301426    Male
Mathematics and statistics              173641   125083    Male
Non-S&E                                 7356720  11916324  Female
Physical sciences                       194365   120797    Male
Psychology                              326865   1109902   Female
Social sciences                         1041377  1244592   Female
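The More column and the even split between female-dominated and male-dominated fields can be recomputed directly from the counts. A minimal Python sketch, with the counts copied from the table (the variable names are chosen here for illustration):

```python
# Degree counts by field, copied from the table: (male, female).
counts = {
    "Agricultural sciences": (152956, 172852),
    "Biological sciences": (516556, 745384),
    "Computer sciences": (626248, 171808),
    "Earth, atmospheric, and ocean sciences": (53118, 36076),
    "Engineering": (1147112, 301426),
    "Mathematics and statistics": (173641, 125083),
    "Non-S&E": (7356720, 11916324),
    "Physical sciences": (194365, 120797),
    "Psychology": (326865, 1109902),
    "Social sciences": (1041377, 1244592),
}

# Label each field by the sex with the larger count.
more = {
    field: ("Female" if female > male else "Male")
    for field, (male, female) in counts.items()
}

female_fields = sum(1 for label in more.values() if label == "Female")
male_fields = len(more) - female_fields
print(female_fields, male_fields)  # 5 5
```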

By Degree and Sex

This table shows the 3 different types of degrees over the course of 11 years, broken down by sex. The data shows that although more female people collectively receive bachelor’s and master’s degrees, male people attain slightly more PhDs.

Degree Male Female More
BS 8137972 10930199 Female
MS 3104414 4669113 Female
PhD 346572 344932 Male
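How narrow the PhD margin is becomes clearer when the counts are expressed as shares. The following Python sketch computes the female share of each degree level and the PhD gap from the table values; the rounding to one decimal place is a presentation choice made here.

```python
# Degree counts by level, copied from the table: (male, female).
degrees = {
    "BS": (8137972, 10930199),
    "MS": (3104414, 4669113),
    "PhD": (346572, 344932),
}

# Female share of all degrees at each level, in percent.
female_share = {
    level: round(100 * female / (male + female), 1)
    for level, (male, female) in degrees.items()
}

# Absolute difference between male and female PhD counts.
phd_gap = degrees["PhD"][0] - degrees["PhD"][1]

print(female_share)  # {'BS': 57.3, 'MS': 60.1, 'PhD': 49.9}
print(phd_gap)  # 1640
```

At the PhD level the two counts differ by 1,640 degrees out of roughly 690,000, so the shares are close to even.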

By Year and Sex

Finally, the last table shows the number of degrees attained each year by male and female people. It shows that female people earn more degrees overall and that there has been a steady increase in degrees over time.

Year Male Female More
2006 904679 1253917 Female
2007 925621 1287439 Female
2008 953360 1320480 Female
2009 985411 1360820 Female
2010 1019514 1404646 Female
2011 1063992 1466539 Female
2012 1107721 1525402 Female
2013 1130821 1552075 Female
2014 1147769 1570559 Female
2015 1163164 1586060 Female
2016 1186906 1616307 Female
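The steady increase can be checked numerically. This Python sketch, using the yearly counts from the table, verifies that both series rise every year and computes the percentage growth from 2006 to 2016; the helper name `pct_growth` is introduced here for illustration.

```python
# Degrees per year from 2006 to 2016, copied from the table.
male = [904679, 925621, 953360, 985411, 1019514, 1063992,
        1107721, 1130821, 1147769, 1163164, 1186906]
female = [1253917, 1287439, 1320480, 1360820, 1404646, 1466539,
          1525402, 1552075, 1570559, 1586060, 1616307]


def strictly_increasing(series):
    """True when every value is larger than the one before it."""
    return all(later > earlier for earlier, later in zip(series, series[1:]))


def pct_growth(series):
    """Percentage change from the first to the last value."""
    return round(100 * (series[-1] - series[0]) / series[0], 1)


print(strictly_increasing(male), strictly_increasing(female))  # True True
print(pct_growth(male), pct_growth(female))  # 31.2 28.9
```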

BS degrees in 2015

The summary statistics tables above are high level summaries. To understand the breakdown of science related fields vs non-science fields in 2015, separate bar plots have been made.

2015 Overall Science and Engineering

This bar plot shows the breakdown of science and engineering fields and compares them to the broader category of non-science and engineering fields.

To show the difference by sex between non-science and engineering (Non-S&E) fields and science and engineering (S&E) fields in 2015, the following bar plot has been made. As the tables above show, more women than men obtain degrees overall, which is highlighted by the larger Non-S&E bar. However, within S&E, the bars for Male and Female are nearly equal in frequency.

Questions

In general, the summary tables show that female people earn more degrees overall; however, when broken down by science-related field, female people more often study social sciences, psychology, biological sciences, and agricultural sciences, while male people more often study physical sciences, mathematics and statistics, engineering, earth, atmospheric, and ocean sciences, and computer sciences. This can be seen in the box plot below. This is consistent with literature showing that these male-dominated fields are often not inclusive of women, and in particular women of color.

To see whether women are represented in the fields underlying data science, the plot has been filtered to include only computer sciences and mathematics and statistics. Despite the foundations that women have established in mathematics, statistics, and computer science, women are underrepresented in both fields.

All Variables

Case Study 3: Major League Baseball

Team                  Change_Payrole  Payrole    Wins  Wins_Pct
Los Angeles Dodgers   0.8511002       149.15418  434   0.5364313
Washington Nationals  0.8200017       91.04086   429   0.5302431
San Diego Padres      0.7443947       59.23019   390   0.4814815
Texas Rangers         0.6839576       103.63741  437   0.5388169
San Francisco Giants  0.6294727       125.62321  436   0.5382716

Yearly payroll or yearly increase in payroll